Method and apparatus for enabling an internet web server to keep an accurate count of page hits

ABSTRACT

A method for enabling a server to maintain an accurate count of web-page views. The server maintains a counter that records the number of times the page is accessed due to automatic browser refreshes, and a counter that records the number of times the page is accessed otherwise. A browser accesses the URL of the web page and records a timestamp. When the browser next accesses the page, it determines the present time, and subtracts the timestamp from the present time. The browser then compares the difference with bounds that reflect a window of tolerance about an expected content-update period of the web page. If the difference is within the bounds, the browser instructs the server to advance the automatic-refresh counter; otherwise, the server advances the page-hit counter.

FIELD OF THE INVENTION

[0001] The present invention applies to the Internet and the field of World Wide Web content servers, and more particularly to a method and apparatus for improving the accuracy with which a server counts the number of times that clients access a web page.

BACKGROUND

[0002] A significant share of the cost of operating the World Wide Web is underwritten by advertising. For example, the cost of providing informational web pages may be supported by displaying advertisements to each Internet user who accesses the web page. For this method to have integrity, the web-page provider must be able to give the advertiser an accurate account of how often the web page is visited, and presumably how many times the advertisement is viewed.

[0003] Providing an accurate count becomes difficult when the server periodically updates the content of the web page, for example when an HTML web page gives scores for sports events in progress. In this case, web browsers may automatically refresh to keep up with changes in the scores by re-accessing the web page periodically, commensurate with the content-update period of the web page. Each time a browser refreshes by re-accessing the web page, the server advances its count of how many times the page is viewed. Often, however, no human is present to see the advertisements when the browser automatically refreshes the updated web page. This means that a distortion is introduced by coupling the count of how many times the web page and its advertisements are viewed to the count of how many times the page is accessed.

[0004] Thus, in order to provide greater accuracy in counting how many times a web page is viewed, there is a need for a method and apparatus for determining whether an access to a web page is made explicitly under the control of a human user or is the result of an automatic web browser refresh.

SUMMARY OF THE INVENTION

[0005] The present invention provides method and apparatus enabling a server to maintain an accurate count of the number of times a web page is viewed, by factoring out accesses to the web page that result from automatic browser refreshes.

[0006] Clients having web browsers connect to a server that provides a web page over the Internet. According to the invention, the server maintains two counters in association with the web page: an automatic-refresh counter that records the number of page accesses generated by browsers due to automatic refreshes, and a page-hit counter that records the number of page accesses not due to automatic refresh operations by the browsers, which are presumed to be page accesses initiated explicitly by a human user.

[0007] When a client first accesses the URL associated with the web page in question, the client determines the present time according to the client's internal clock, and records this time as a timestamp. The client then reads information from the web page regarding the rate at which the server updates the content of the web page, in particular the page's content-update period. Using this information, the client configures its web browser to refresh automatically according to the content-update period of the web page.

[0008] The client then monitors for the occurrence of two conditions: (a) manual access of the web page initiated by a user who clicks on a refresh button (or, equivalently, accesses a bookmark, or a link from another page, or an entry in the web browser's history, and so forth), and (b) the expiration of a clock that indicates time for automatic refresh. When either (a) or (b) occurs, the client sends a new request to the server to access the web page. In preparing for the request, the client reads the previously recorded timestamp, and determines the present time. The client then subtracts the timestamp from the present time, to provide a difference, and compares the difference with preestablished bounds that reflect a window of tolerance about the content-update period of the web page. If the difference falls within the bounds, the client instructs the server to advance the automatic-refresh counter; if the difference falls outside the bounds, the client instructs the server to advance the page-hit counter. The client then overwrites the timestamp with the present time, for reference when (a) or (b) occurs again.

[0009] These and other aspects and advantages of the invention will be more fully appreciated when considered in light of the following drawings and detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] FIG. 1 shows a web server according to the invention, connected via the Internet to a client that has a web browser.

[0011] FIG. 2 shows an exemplary method by which the client provides information that enables the server to improve the accuracy of the count of page hits.

[0012] FIG. 3 shows an exemplary way of distinguishing between a page hit and an automatic refresh.

[0013] FIG. 4 shows aspects of the operation of a server that is improved according to the present invention.

DETAILED DESCRIPTION

[0014] The present invention provides a more accurate way of counting how many times Internet users access and view a web page by keeping, and factoring out, a count of accesses to the web page caused by automatic web browser refreshing.

[0015] FIG. 1 shows a client 100 connected, via the Internet 120, to a server 130. The client 100 may be a personal computer, and may include a web browser 110 for accessing a web page provided by the server 130. Although FIG. 1 shows connection via the Internet 120, connection may also be provided in other ways, for example by an Intranet or by any other communication network.

[0016] The server 130 according to an embodiment of the present invention includes a page-hit counter 132 and an automatic-refresh counter 134, where both are associated with a web page provided by the server 130. The purpose of the counters is to enable the server 130 to keep an accurate tally of how many times clients view the web page.

[0017] The server 130 keeps a count of page hits by advancing the page-hit counter 132. Here, a “page hit” is an access of the web page made under the deliberate control of a user. Examples of page hits include instances when a user clicks a browser reload (refresh) button, or goes to a bookmarked page, or clicks a back button, or navigates to an entry in a browser history, or follows a link to the subject web page from another web page, or types a URL into the browser's navigation line, and so forth. The commonality is that a human user explicitly initiates the action leading to the web page access.

[0018] Not all web-page accesses are page hits. Here, web page accesses caused by automatic refresh activities of browsers such as web browser 110 are called “automatic refreshes,” and are counted separately from page hits. When the client 100 first accesses a web page, the client 100 may read information from the web page regarding the rate at which the server 130 updates the content of the web page, in particular the page's content-update period. For example, the server 130 may update the content of the web page every 120 seconds, in which case the content-update period would be 120 seconds. Using this information, the client 100 configures the web browser 110 to refresh automatically according to the content-update period. When the web page is subsequently accessed as a result of an automatic refresh, the server advances the automatic-refresh counter 134 rather than the page-hit counter 132.

[0019] The present invention also encompasses all other equivalent ways of keeping such counts. For example, in another embodiment of the invention, a first counter may be kept that records the total number of times the web page is accessed for any reason, and a second counter may be kept that records the number of automatic refreshes. In this case, the number of page hits may be computed by subtracting the number of automatic refreshes from the total number of times the web page is accessed. In yet another exemplary embodiment of the invention, the second counter may record the number of page hits, in which case the number of automatic refreshes may be computed by subtracting the number of page hits from the total number of times the web page is accessed. These and other similar embodiments fall within the scope of the present invention.
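
The equivalence of these counting schemes can be sketched as follows. This is a minimal illustration only: the function names `fromTotalAndAuto` and `fromTotalAndHits`, the returned object shape, and the sample numbers are hypothetical and are not taken from the figures.

```javascript
// Hypothetical helpers: derive the third tally from whichever two are kept.

// Embodiment keeping a total-access counter and an automatic-refresh counter.
function fromTotalAndAuto(totalAccesses, automaticRefreshes) {
  return {
    totalAccesses,
    automaticRefreshes,
    pageHits: totalAccesses - automaticRefreshes,
  };
}

// Embodiment keeping a total-access counter and a page-hit counter.
function fromTotalAndHits(totalAccesses, pageHits) {
  return {
    totalAccesses,
    pageHits,
    automaticRefreshes: totalAccesses - pageHits,
  };
}

// Illustrative numbers only: 100 accesses, 60 of them automatic refreshes.
const viaAuto = fromTotalAndAuto(100, 60);
const viaHits = fromTotalAndHits(100, 40);
console.log(viaAuto.pageHits, viaHits.automaticRefreshes); // 40 60
```

Either pair of counters carries the same information, so the choice between them is a matter of implementation convenience.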

[0020] FIG. 2 shows an exemplary way in which the client 100 may determine whether a page access under its operation is a page hit or an automatic refresh. As shown in FIG. 2, the client 100 accesses the URL associated with the web page in question (step 200). The client 100 records the approximate time of the access (step 205) as a timestamp. In a preferred embodiment of the invention, the browser 110 calculates current epoch time using the getTime() method of the JavaScript Date object. The timestamp may be recorded as part of a CGI query string associated with the page access, or within a cookie, or within a frameset.
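
One of the recording options named above, the query string, might look like the following sketch. The parameter name `ts`, the helper names `stampUrl` and `readTimestamp`, and the example URL are assumptions made for illustration; the description does not specify them.

```javascript
// Hypothetical query-string parameter name; the description does not fix one.
const TIMESTAMP_PARAM = "ts";

// Return a copy of pageUrl with the access time (epoch milliseconds)
// recorded in its query string (step 205).
function stampUrl(pageUrl, nowMs = new Date().getTime()) {
  const url = new URL(pageUrl);
  url.searchParams.set(TIMESTAMP_PARAM, String(nowMs));
  return url.toString();
}

// Read the recorded timestamp back; returns null when none is present
// or when the recorded value is not a number.
function readTimestamp(pageUrl) {
  const raw = new URL(pageUrl).searchParams.get(TIMESTAMP_PARAM);
  if (raw === null || raw === "") return null;
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

const stamped = stampUrl("http://example.com/scores.html", 1000000);
console.log(stamped);                // http://example.com/scores.html?ts=1000000
console.log(readTimestamp(stamped)); // 1000000
```

A cookie or frameset could hold the same value; the query string is used here only because it keeps the sketch self-contained.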

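[0021] The browser 110 then determines the content-update period of the web page (step 210) and appropriately configures its automatic refresh (step 215), for example, for a content-update period of 120 seconds, according to:

    <META http-equiv="refresh" content="120">

or according to:

    <script language="JavaScript">
    move = setTimeout("location.href=location.pathname;", 120000);
    </script>

The META refresh value is given in seconds, whereas the setTimeout delay is given in milliseconds, so 120 seconds corresponds to 120000.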
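
Because a META refresh takes its delay in seconds while setTimeout takes its delay in milliseconds, it can help to derive both settings from a single period value. The following is a small sketch under that assumption; the helper names `metaRefreshTag` and `timeoutDelayMs` are hypothetical.

```javascript
// Hypothetical helpers: derive both refresh settings from one
// content-update period expressed in seconds.

// META refresh expects the delay in seconds.
function metaRefreshTag(periodSeconds) {
  return `<META http-equiv="refresh" content="${periodSeconds}">`;
}

// setTimeout expects the delay in milliseconds.
function timeoutDelayMs(periodSeconds) {
  return periodSeconds * 1000;
}

console.log(metaRefreshTag(120)); // <META http-equiv="refresh" content="120">
console.log(timeoutDelayMs(120)); // 120000
```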

[0022] The client 100 then awaits refresh activity (step 220). When refresh activity is detected, the client 100 reads the previously recorded timestamp (step 230), and determines the present time (step 235). The client then analyzes the timestamp, the present time, and the content-update period (step 240), and determines whether the refresh activity is indicative of an automatic refresh (step 245). Details regarding steps 240 and 245 of FIG. 2 are discussed below with reference to FIG. 3. If the refresh activity is indicative of an automatic refresh, the client 100 instructs the server 130 to provide the web page and to advance the automatic-refresh counter 134 (step 255), overwrites the timestamp with the present time (step 260), and returns to await further refresh activity (step 220). Otherwise (i.e., the refresh activity is not indicative of an automatic refresh, and is therefore presumed to be indicative of a page hit), the client 100 instructs the server 130 to provide the web page and to advance the page-hit counter 132 (step 250), overwrites the timestamp with the present time (step 260), and returns to await further refresh activity (step 220).
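
The loop of steps 220 through 260 can be sketched as a small stateful handler. This is one possible arrangement, not the only one: the name `createRefreshTracker`, the injected `now` clock, the `instruct` callback, and the instruction strings are all hypothetical, and the bounds are passed in milliseconds purely for convenience.

```javascript
// Sketch of steps 205 and 220-260, with an injected clock and transport
// so the flow can be exercised without a browser (both are assumptions).
function createRefreshTracker({ lowerMs, upperMs, now, instruct }) {
  let timestamp = now(); // step 205: record the time of the first access

  // Call this on any refresh activity, manual or automatic (step 220).
  return function onRefreshActivity() {
    const previous = timestamp;         // step 230: read recorded timestamp
    const present = now();              // step 235: determine present time
    const elapsed = present - previous; // step 240: analyze
    const automatic = elapsed >= lowerMs && elapsed <= upperMs; // step 245
    // Step 255 (automatic refresh) or step 250 (page hit).
    instruct(automatic ? "automatic-refresh" : "page-hit");
    timestamp = present;                // step 260: overwrite timestamp
    return automatic;
  };
}

// Usage with a fake clock and a 118-145 second tolerance interval.
let fakeNow = 0;
const sent = [];
const onRefresh = createRefreshTracker({
  lowerMs: 118000,
  upperMs: 145000,
  now: () => fakeNow,
  instruct: (counter) => sent.push(counter),
});
fakeNow = 120000; onRefresh(); // 120 s after first access
fakeNow = 150000; onRefresh(); // 30 s after the previous access
console.log(sent); // [ 'automatic-refresh', 'page-hit' ]
```

Note that the timestamp is overwritten on both branches, so each classification is always relative to the most recent access.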

[0023] FIG. 3 shows an exemplary way of distinguishing between an automatic refresh and activity indicative of a page hit. The distinction is made by finding the time that has passed since the web page was last accessed by the browser 110, and comparing this time to a bound. The bound may be an endpoint of a tolerance interval that surrounds (includes) the content-update period, where the tolerance interval accounts for the various and unpredictable delays encountered in re-loading the web page. For example, if the content-update period is 120 seconds, the tolerance interval might be between 118 and 145 seconds. Then, if the time between the last access of the web page and a current access is between 118 and 145 seconds, the current access is indicative of an automatic refresh; otherwise (i.e., the time between the last access of the web page and the current access of the web page is less than 118 seconds, or greater than 145 seconds), the current access of the web page is activity indicative of a page hit.

[0024] As shown in FIG. 3, the timestamp is subtracted from the present time to provide a difference (step 300). The difference is compared with the tolerance interval (step 310), and a determination is made whether the difference falls within the tolerance interval (step 320). If the difference falls within the tolerance interval, the client 100 instructs the server 130 to advance the automatic-refresh counter 134 (step 330); otherwise (i.e., the difference is not within the tolerance interval), the client 100 instructs the server 130 to advance the page-hit counter 132 (step 340).
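
The comparison of FIG. 3 can be sketched as a pure function. The names `toleranceInterval` and `classifyAccess` are hypothetical, and the default margins (2 seconds early, 25 seconds late) are assumptions chosen only so that a 120-second period reproduces the 118 to 145 second example; the description does not prescribe how the interval is derived. The endpoints are treated as inclusive, consistent with the example in which only differences less than 118 or greater than 145 seconds indicate a page hit.

```javascript
// Hypothetical margins: chosen so that a 120 s period yields 118-145 s.
function toleranceInterval(periodSeconds, earlySeconds = 2, lateSeconds = 25) {
  return {
    lower: periodSeconds - earlySeconds,
    upper: periodSeconds + lateSeconds,
  };
}

// Classify an access from two epoch-millisecond times and an interval
// expressed in seconds.
function classifyAccess(timestampMs, presentMs, interval) {
  const differenceSeconds = (presentMs - timestampMs) / 1000; // step 300
  const within =
    differenceSeconds >= interval.lower &&
    differenceSeconds <= interval.upper;                      // steps 310, 320
  return within ? "automatic-refresh" : "page-hit";           // step 330 or 340
}

const interval = toleranceInterval(120);
console.log(interval);                            // { lower: 118, upper: 145 }
console.log(classifyAccess(0, 121000, interval)); // automatic-refresh
console.log(classifyAccess(0, 30000, interval));  // page-hit
console.log(classifyAccess(0, 600000, interval)); // page-hit
```

An asymmetric interval reflects that a timed refresh may arrive late because of network and rendering delays, but should rarely arrive early.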

[0025] The client 100 may instruct the server 130 regarding the page-hit counter 132 and the automatic-refresh counter 134 in a number of ways. These instructions may pass as part of a CGI query string, through a cookie, or as part of a frameset. A preferred embodiment of the present invention passes information via a request for an uncachable GIF image with a specific query string, as described by co-pending U.S. patent application Ser. No. 09/641,495, to the present Assignee, filed Aug. 18, 2000, “Gathering Enriched Web Server Activity Data of Cached Web Content,” the entirety of which is hereby incorporated herein by reference.
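
As a rough illustration of passing an instruction in the query string of an image request, the sketch below builds such a URL. It is not the mechanism of the co-pending application, whose details are not reproduced here: the endpoint `count.gif`, the parameter names `counter` and `nocache`, and the use of a varying parameter to discourage caching are all assumptions.

```javascript
// Hypothetical endpoint and parameter names, for illustration only.
function buildBeaconUrl(baseUrl, counter, nowMs = Date.now()) {
  const url = new URL(baseUrl);
  url.searchParams.set("counter", counter);        // which counter to advance
  url.searchParams.set("nocache", String(nowMs));  // varies on every request
  return url.toString();
}

console.log(
  buildBeaconUrl("http://example.com/count.gif", "automatic-refresh", 12345)
);
// http://example.com/count.gif?counter=automatic-refresh&nocache=12345
```

In a browser, the resulting URL could be requested by assigning it to the `src` of an image element, which sends the instruction without navigating away from the page.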

[0026] FIG. 4 shows exemplary actions taken by the server 130 in response to receiving an instruction from the client 100. The server 130 receives the instruction from the client 100 (step 400), and provides the requested web page (step 410). The server 130 then further parses the instruction to determine whether the automatic-refresh counter 134 should be advanced (step 420). If the instruction indicates that the automatic-refresh counter 134 should be advanced, the server 130 advances the automatic-refresh counter 134 (step 430). Otherwise (i.e., the instruction indicates that the page-hit counter 132 should be advanced, or the instruction does not indicate that a counter should be advanced), the server advances the page-hit counter 132 (step 440).
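
The server-side decision of steps 420 through 440 can be sketched as follows, assuming the instruction arrives as a query string. The parameter name `counter`, its value `automatic-refresh`, and the function names are hypothetical; serving the page itself (step 410) is omitted.

```javascript
// Hypothetical in-memory counters standing in for counters 132 and 134.
function createCounters() {
  return { pageHits: 0, automaticRefreshes: 0 };
}

// Parse the instruction and advance exactly one counter per access.
function handleInstruction(counters, queryString) {
  const params = new URLSearchParams(queryString);
  if (params.get("counter") === "automatic-refresh") { // step 420
    counters.automaticRefreshes += 1;                  // step 430
  } else {
    // Page-hit instruction, or no counter instruction at all.
    counters.pageHits += 1;                            // step 440
  }
  return counters;
}

const counters = createCounters();
handleInstruction(counters, "counter=automatic-refresh");
handleInstruction(counters, "counter=page-hit");
handleInstruction(counters, ""); // no instruction: counted as a page hit
console.log(counters); // { pageHits: 2, automaticRefreshes: 1 }
```

Defaulting to the page-hit counter means that a client that sends no instruction is counted as before, and is never undercounted.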

[0027] From the preceding description, those skilled in the art will now appreciate that the present invention provides a more accurate way of counting how many times Internet users view a web page. The foregoing description is illustrative rather than limiting, however, and the invention is limited only by the claims that follow.

We claim:
 1. A server for providing a web page, improved by including an automatic-refresh counter in association with the web page.
 2. The server of claim 1, wherein the server advances the automatic-refresh counter responsive to receiving an instruction from a client.
 3. A method for enabling a server that provides a web page to distinguish between a page hit by a client and an automatic refresh by the client, comprising the step of receiving an instruction from the client instructing the server to advance a page-hit counter.
 4. A method for enabling a server that provides a web page to distinguish between a page hit by a client and an automatic refresh by the client, comprising the step of receiving an instruction from the client instructing the server to advance an automatic-refresh counter.
 5. A method for enabling a server that provides a web page to distinguish between a page hit by a client and an automatic refresh by the client, comprising the steps of: receiving an instruction from the client instructing the server to advance a page-hit counter; and receiving an instruction from the client instructing the server to advance an automatic-refresh counter.
 6. A method for improving the accuracy with which a count of web page hits is kept by a web page server, comprising the steps of: determining whether a web page access is an automatic refresh; and when the web page access is determined to be an automatic refresh, instructing the server to advance an automatic-refresh counter; wherein the steps of determining and instructing are performed by a client of the server.
 7. A method for providing information that enables a server to improve the accuracy of a count of the number of times a web page is viewed, comprising: accessing a web page by a client, using a web browser; recording a timestamp indicating an approximate time of accessing the web page by the client; determining a content-update period for the web page; configuring the browser to refresh automatically according to the content-update period; re-accessing the web page; determining an approximate time of re-accessing the web page; analyzing the timestamp, the approximate time of re-accessing the web page, and the content-update period; and determining, responsive to an outcome of the step of analyzing, whether the step of re-accessing the web page is indicative of an automatic refresh by the web browser.
 8. The method of claim 7, further comprising a step of instructing the server to advance an automatic-refresh counter, responsive to an outcome of the step of determining whether the step of re-accessing the web page is indicative of an automatic refresh by the web browser.
 9. The method of claim 7, further comprising a step of instructing the server to advance a page-hit counter, responsive to an outcome of the step of determining whether the step of re-accessing the web page is indicative of an automatic refresh by the web browser.
 10. The method of claim 7, wherein the step of analyzing includes the steps of computing a difference by subtracting the timestamp from the approximate time of re-accessing the web page, and comparing the difference to a bound.
 11. The method of claim 10, wherein the bound is an endpoint of a tolerance interval that includes the content-update period.
 12. The method of claim 7, wherein the step of re-accessing the web page is responsive to an automatic refresh by the web browser.
 13. The method of claim 7, wherein the step of re-accessing the web page is responsive to selection of a bookmark of the browser.
 14. The method of claim 7, wherein the step of re-accessing the web page is responsive to selection of a history entry of the browser that refers to the web page.
 15. The method of claim 7, wherein the step of re-accessing the web page is responsive to selection of a link to the web page.
 16. The method of claim 7, wherein the step of re-accessing the web page is responsive to selection of a reload button of the browser.
 17. The method of claim 7, wherein the step of re-accessing the web page is responsive to manual entry of a URL into a navigation field of the browser.